Learning Efficient Algorithms with Hierarchical Attentive Memory
نویسندگان
چکیده
In this paper, we propose and investigate a novel memory architecture for neural networks called Hierarchical Attentive Memory (HAM). It is based on a binary tree with leaves corresponding to memory cells. This allows HAM to perform memory access in Θ(logn) complexity, which is a significant improvement over the standard attention mechanism that requires Θ(n) operations, where n is the size of the memory. We show that an LSTM network augmented with HAM can learn algorithms for problems like merging, sorting or binary searching from pure input-output examples. In particular, it learns to sort n numbers in time Θ(n logn) and generalizes well to input sequences much longer than the ones seen during the training. We also show that HAM can be trained to act like classic data structures: a stack, a FIFO queue and a priority queue.
منابع مشابه
Nearest Neighbor based Clustering Algorithm for Large Data Sets
Clustering is an unsupervised learning technique in which data or objects are grouped into sets based on some similarity measure. Most of the clustering algorithms assume that the main memory is infinite and can accommodate the set of patterns. In reality many applications give rise to a large set of patterns which does not fit in the main memory. When the data set is too large, much of the dat...
متن کاملA Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression
We present memory-efficient and scalable algorithms for kernel methods used in machine learning. Using hierarchical matrix approximations for the kernel matrix the memory requirements, the number of floating point operations, and the execution time are drastically reduced compared to standard dense linear algebra routines. We consider both the general H matrix hierarchical format as well as Hie...
متن کاملScalable Data Mining for Rules
Data Mining is the process of automatic extraction of novel, useful, and understandable patterns in very large databases. High-performance scalable and parallel computing is crucial for ensuring system scalability and interactivity as datasets grow inexorably in size and complexity. This thesis deals with both the algorithmic and systems aspects of scalable and parallel data mining algorithms a...
متن کاملMachine learning algorithms for time series in financial markets
This research is related to the usefulness of different machine learning methods in forecasting time series on financial markets. The main issue in this field is that economic managers and scientific society are still longing for more accurate forecasting algorithms. Fulfilling this request leads to an increase in forecasting quality and, therefore, more profitability and efficiency. In this pa...
متن کاملA Hybrid Optimization Algorithm for Learning Deep Models
Deep learning is one of the subsets of machine learning that is widely used in Artificial Intelligence (AI) field such as natural language processing and machine vision. The learning algorithms require optimization in multiple aspects. Generally, model-based inferences need to solve an optimized problem. In deep learning, the most important problem that can be solved by optimization is neural n...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1602.03218 شماره
صفحات -
تاریخ انتشار 2016